A Theoretical Analysis of Model-Based Interval Estimation: Proofs

Authors

  • Alexander L. Strehl
  • Michael L. Littman
Abstract

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less “online” cousins from the literature.

Similar resources

Bayes Interval Estimation on the Parameters of the Weibull Distribution for Complete and Censored Tests

A method for constructing confidence intervals on parameters of a continuous probability distribution is developed in this paper. The objective is to present a model for an uncertainty represented by parameters of a probability density function.  As an application, confidence intervals for the two parameters of the Weibull distribution along with their joint confidence interval are derived. The...

Interval Estimation for the Exponential Distribution under Progressive Type-II Censored Step-Stress Accelerated Life-Testing Model Based on Fisher Information

This paper determines confidence intervals using the Fisher information under progressive type-II censoring for the k-step exponential step-stress accelerated life-testing model. We study the performance of these confidence intervals. Finally, an example is given to illustrate the proposed procedures.

Sub-optimal Estimation of HIV Time-delay Model using State-Dependent Impulsive Observer with Time-varying Impulse Interval: Application to Continuous-time and Impulsive Inputs

Human Immunodeficiency Virus (HIV) weakens the immune system's ability to confront various diseases by attacking CD4+ T cells. In modeling HIV behavior, the number of CD4+ T cells is taken as the output. However, continuous-time measurement of these cells is not possible in practice, and measurements are available only at variable intervals that are several times longer than the sampling time. In this...

One-Sided Interval Trees

We give an alternative treatment and extension of some results of Itoh and Mahmoud on one-sided interval trees. The proofs are based on renewal theory, including a case with mixed multiplicative and additive renewals.

A confidence-aware interval-based trust model

It is a common and useful task in a web of trust to evaluate the trust value between two nodes using intermediate nodes. This technique is widely used when the source node has no experience of direct interaction with the target node, or the direct trust is not reliable enough by itself. If trust is used to support decision-making, it is important to have not only an accurate estimate of trust, ...

Publication date: 2005